{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wFPyjGqMQ82Q"
      },
      "source": [
        "##### Copyright 2020 The TensorFlow Authors.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "aNZ7aEDyQIYU"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uMOmzhPEQh7b"
      },
      "source": [
        "# Normalizations\n",
        "\n",
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://www.tensorflow.org/addons/tutorials/layers_normalizations\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/addons/blob/master/docs/tutorials/layers_normalizations.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/addons/blob/master/docs/tutorials/layers_normalizations.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a href=\"https://storage.googleapis.com/tensorflow_docs/addons/docs/tutorials/layers_normalizations.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
        "  </td>\n",
        "</table>\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cthm5dovQMJl"
      },
      "source": [
        "## Overview\n",
        "This notebook gives a brief introduction into the [normalization layers](https://github.com/tensorflow/addons/blob/master/tensorflow_addons/layers/normalizations.py) of TensorFlow. Currently supported layers are:\n",
        "* **Group Normalization** (TensorFlow Addons)\n",
        "* **Instance Normalization** (TensorFlow Addons)\n",
        "* **Layer Normalization** (TensorFlow Core)\n",
        "\n",
        "The basic idea behind these layers is to normalize the output of an activation layer to improve the convergence during training. In contrast to [batch normalization](https://keras.io/layers/normalization/) these normalizations do not work on batches, instead they normalize the activations of a single sample, making them suitable for recurrent neual networks as well. \n",
        "\n",
        "Typically the normalization is performed by calculating the mean and the standard deviation of a subgroup in your input tensor. It is also possible to apply a scale and an offset factor to this as well.\n",
        "\n",
        "\n",
        "$y_{i} = \\frac{\\gamma ( x_{i} - \\mu )}{\\sigma }+ \\beta$\n",
        "\n",
        "$ y$ : Output\n",
        "\n",
        "$x$ : Input\n",
        "\n",
        "$\\gamma$ : Scale factor\n",
        "\n",
        "$\\mu$: mean\n",
        "\n",
        "$\\sigma$: standard deviation\n",
        "\n",
        "$\\beta$: Offset factor\n",
        "\n",
        "\n",
        "The following image demonstrates the difference between these techniques. Each subplot shows an input tensor, with N as the batch axis, C as the channel axis, and (H, W)\n",
        "as the spatial axes (Height and Width of a picture for example). The pixels in blue are normalized by the same mean and variance, computed by aggregating the values of these pixels.\n",
        "\n",
        "![](https://github.com/shaohua0116/Group-Normalization-Tensorflow/raw/master/figure/gn.png)\n",
        "\n",
        "Source: (https://arxiv.org/pdf/1803.08494.pdf)\n",
        "\n",
        "The weights gamma and beta are trainable in all normalization layers to compensate for the possible lost of representational ability. You can activate these factors by setting the `center` or the `scale` flag to `True`. Of course you can use `initializers`, `constraints` and `regularizer` for `beta` and `gamma` to tune these values during the training process. "
      ]
    },
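    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough illustration of the formula above, the following sketch normalizes a single sample with NumPy. The helper `normalize_sample`, the `epsilon` term (added for numerical stability and not part of the formula above), and the example values are assumptions made for this illustration. It is not how the library layers are implemented."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "def normalize_sample(x, gamma=1.0, beta=0.0, epsilon=1e-5):\n",
        "  # Hypothetical helper: y = gamma * (x - mu) / sigma + beta for one sample.\n",
        "  mu = x.mean()\n",
        "  sigma = np.sqrt(x.var() + epsilon)  # epsilon avoids division by zero\n",
        "  return gamma * (x - mu) / sigma + beta\n",
        "\n",
        "x = np.array([1.0, 2.0, 3.0, 4.0])\n",
        "y = normalize_sample(x, gamma=2.0, beta=0.5)\n",
        "# The mean of y is approximately beta, its standard deviation approximately gamma.\n",
        "print(y.mean(), y.std())"
      ]
    },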
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "I2XlcXf5WBHb"
      },
      "source": [
        "## Setup"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kTlbneoEUKrD"
      },
      "source": [
        "### Install Tensorflow 2.0 and Tensorflow-Addons"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "_ZQGY_ALnirQ"
      },
      "outputs": [],
      "source": [
        "!pip install -U tensorflow-addons"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7aGgPZG_WBHg"
      },
      "outputs": [],
      "source": [
        "import tensorflow as tf\n",
        "import tensorflow_addons as tfa"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "u82Gz_gOUPDZ"
      },
      "source": [
        "### Preparing Dataset"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "3wso9oidUZZQ"
      },
      "outputs": [],
      "source": [
        "mnist = tf.keras.datasets.mnist\n",
        "\n",
        "(x_train, y_train),(x_test, y_test) = mnist.load_data()\n",
        "x_train, x_test = x_train / 255.0, x_test / 255.0"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UTQH56j89POZ"
      },
      "source": [
        "## Group Normalization Tutorial \n",
        "\n",
        "### Introduction\n",
        "Group Normalization(GN) divides the channels of your inputs into smaller sub groups and normalizes these values based on their mean and variance. Since GN works on a single example this technique is batchsize independent. \n",
        "\n",
        "GN experimentally scored closed to batch normalization in image classification tasks. It can be beneficial to use GN instead of Batch Normalization in case your overall batch_size is low, which would lead to bad performance of batch normalization  \n",
        "\n",
        "###Example\n",
        "Splitting 10 channels after a Conv2D layer into 5 subgroups in a standard \"channels last\" setting:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aIGjLwYWAm0v"
      },
      "outputs": [],
      "source": [
        "model = tf.keras.models.Sequential([\n",
        "  # Reshape into \"channels last\" setup.\n",
        "  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),\n",
        "  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format=\"channels_last\"),\n",
        "  # Groupnorm Layer\n",
        "  tfa.layers.GroupNormalization(groups=5, axis=3),\n",
        "  tf.keras.layers.Flatten(),\n",
        "  tf.keras.layers.Dense(128, activation='relu'),\n",
        "  tf.keras.layers.Dropout(0.2),\n",
        "  tf.keras.layers.Dense(10, activation='softmax')\n",
        "])\n",
        "\n",
        "model.compile(optimizer='adam',\n",
        "              loss='sparse_categorical_crossentropy',\n",
        "              metrics=['accuracy'])\n",
        "model.fit(x_test, y_test)"
      ]
    },
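    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the grouping concrete, here is a minimal NumPy sketch of the normalization step for a channels-last tensor. The helper `group_norm`, the `epsilon` value, and the random input are assumptions for illustration. The input shape `(2, 26, 26, 10)` mirrors the Conv2D output above, and the trainable scale and offset are left out. This is a sketch of the idea, not the TensorFlow Addons implementation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "def group_norm(x, groups, epsilon=1e-5):\n",
        "  # Hypothetical helper for channels-last input of shape (N, H, W, C).\n",
        "  n, h, w, c = x.shape\n",
        "  # Split the channel axis into (groups, channels per group).\n",
        "  grouped = x.reshape(n, h, w, groups, c // groups)\n",
        "  # One mean and variance per (sample, group), shared over H, W and the group's channels.\n",
        "  mean = grouped.mean(axis=(1, 2, 4), keepdims=True)\n",
        "  var = grouped.var(axis=(1, 2, 4), keepdims=True)\n",
        "  normalized = (grouped - mean) / np.sqrt(var + epsilon)\n",
        "  return normalized.reshape(n, h, w, c)\n",
        "\n",
        "rng = np.random.default_rng(0)\n",
        "x = rng.normal(loc=3.0, scale=2.0, size=(2, 26, 26, 10))\n",
        "y = group_norm(x, groups=5)\n",
        "# Each of the 2 * 5 (sample, group) blocks now has a mean close to 0.\n",
        "print(y.reshape(2, 26, 26, 5, 2).mean(axis=(1, 2, 4)))"
      ]
    },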
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "QMwUfJUib3ka"
      },
      "source": [
        "## Instance Normalization Tutorial\n",
        "### Introduction\n",
        "Instance Normalization is special case of group normalization where the group size is the same size as the channel size (or the axis size).\n",
        "\n",
        "Experimental results show that instance normalization performs well on style transfer when replacing batch normalization. Recently, instance normalization has also been used as a replacement for batch normalization in GANs.\n",
        "\n",
        "### Example\n",
        "Applying InstanceNormalization after a Conv2D Layer and using a uniformed initialized scale and offset factor."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6sLVv-C8f6Kf"
      },
      "outputs": [],
      "source": [
        "model = tf.keras.models.Sequential([\n",
        "  # Reshape into \"channels last\" setup.\n",
        "  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),\n",
        "  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format=\"channels_last\"),\n",
        "  # LayerNorm Layer\n",
        "  tfa.layers.InstanceNormalization(axis=3, \n",
        "                                   center=True, \n",
        "                                   scale=True,\n",
        "                                   beta_initializer=\"random_uniform\",\n",
        "                                   gamma_initializer=\"random_uniform\"),\n",
        "  tf.keras.layers.Flatten(),\n",
        "  tf.keras.layers.Dense(128, activation='relu'),\n",
        "  tf.keras.layers.Dropout(0.2),\n",
        "  tf.keras.layers.Dense(10, activation='softmax')\n",
        "])\n",
        "\n",
        "model.compile(optimizer='adam',\n",
        "              loss='sparse_categorical_crossentropy',\n",
        "              metrics=['accuracy'])\n",
        "model.fit(x_test, y_test)"
      ]
    },
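    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "With one channel per group, the statistics reduce to one mean and variance per sample and channel, computed over the spatial axes. The NumPy sketch below shows this under the same assumptions as before: the helper `instance_norm`, the `epsilon` value, and the random input are illustrative, and the trainable scale and offset are omitted."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "\n",
        "def instance_norm(x, epsilon=1e-5):\n",
        "  # Hypothetical helper for channels-last input of shape (N, H, W, C).\n",
        "  # Each (sample, channel) pair gets its own mean and variance over H and W.\n",
        "  mean = x.mean(axis=(1, 2), keepdims=True)\n",
        "  var = x.var(axis=(1, 2), keepdims=True)\n",
        "  return (x - mean) / np.sqrt(var + epsilon)\n",
        "\n",
        "rng = np.random.default_rng(0)\n",
        "x = rng.normal(loc=3.0, scale=2.0, size=(2, 26, 26, 10))\n",
        "y = instance_norm(x)\n",
        "# Largest absolute per-(sample, channel) mean after normalization, close to 0.\n",
        "print(np.abs(y.mean(axis=(1, 2))).max())"
      ]
    },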
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qYdnEocRUCll"
      },
      "source": [
        "## Layer Normalization Tutorial\n",
        "### Introduction\n",
        "Layer Normalization is special case of group normalization where the group size is 1. The mean and standard deviation is calculated from all activations of a single sample.\n",
        "\n",
        "Experimental results show that Layer normalization is well suited for Recurrent Neural Networks, since it works batchsize independt.\n",
        "\n",
        "### Example\n",
        "\n",
        "Applying Layernormalization after a Conv2D Layer and using a scale and offset factor. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Fh-Pp_e5UB54"
      },
      "outputs": [],
      "source": [
        "model = tf.keras.models.Sequential([\n",
        "  # Reshape into \"channels last\" setup.\n",
        "  tf.keras.layers.Reshape((28,28,1), input_shape=(28,28)),\n",
        "  tf.keras.layers.Conv2D(filters=10, kernel_size=(3,3),data_format=\"channels_last\"),\n",
        "  # LayerNorm Layer\n",
        "  tf.keras.layers.LayerNormalization(axis=3 , center=True , scale=True),\n",
        "  tf.keras.layers.Flatten(),\n",
        "  tf.keras.layers.Dense(128, activation='relu'),\n",
        "  tf.keras.layers.Dropout(0.2),\n",
        "  tf.keras.layers.Dense(10, activation='softmax')\n",
        "])\n",
        "\n",
        "model.compile(optimizer='adam',\n",
        "              loss='sparse_categorical_crossentropy',\n",
        "              metrics=['accuracy'])\n",
        "model.fit(x_test, y_test)"
      ]
    },
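    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `axis` argument of `tf.keras.layers.LayerNormalization` determines which activations share a mean and variance. As a hedged sketch, the cell below compares `axis=3`, as used in the example above, with `axis=[1, 2, 3]`, which computes the statistics over all activations of a sample. The random input and its shape are assumptions chosen to match the Conv2D output."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import tensorflow as tf\n",
        "\n",
        "rng = np.random.default_rng(0)\n",
        "x = rng.normal(loc=3.0, scale=2.0, size=(2, 26, 26, 10)).astype('float32')\n",
        "\n",
        "# axis=3: statistics over the channel axis only, separately at each spatial position.\n",
        "per_position = tf.keras.layers.LayerNormalization(axis=3)(x)\n",
        "# axis=[1, 2, 3]: statistics over all activations of a sample.\n",
        "per_sample = tf.keras.layers.LayerNormalization(axis=[1, 2, 3])(x)\n",
        "\n",
        "# Both maxima should be close to 0, but the means are taken over different axes.\n",
        "print(tf.reduce_max(tf.abs(tf.reduce_mean(per_position, axis=3))).numpy())\n",
        "print(tf.reduce_max(tf.abs(tf.reduce_mean(per_sample, axis=[1, 2, 3]))).numpy())"
      ]
    },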
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "shvGfnB0WpQQ"
      },
      "source": [
        "## Literature\n",
        "[Layer norm](https://arxiv.org/pdf/1607.06450.pdf)\n",
        "\n",
        "[Instance norm](https://arxiv.org/pdf/1607.08022.pdf)\n",
        "\n",
        "[Group Norm](https://arxiv.org/pdf/1803.08494.pdf)\n",
        "\n",
        "[Complete Normalizations Overview](http://mlexplained.com/2018/11/30/an-overview-of-normalization-methods-in-deep-learning/)"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "layers_normalizations.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
